CHAPTER 3 Getting Statistical: A Short Review of Basic Statistics 35

» The histogram’s y-axis represents the number (or frequency) of individuals in the data that fall in the numerical ranges (known as classes) of the value being charted, which are listed across the x-axis. In this case, the y-axis would represent the number of states falling in each class.

» The histogram’s x-axis represents classes, or numerical ranges of the value being charted, which in this case is the number of airports.
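To make the two axes concrete, here is a minimal sketch of how values are grouped into classes and counted. The function name `histogram_counts`, the class width of 50, and the eight airport counts are all hypothetical and chosen only for illustration; they are not the 2011 data behind Figure 3-1.

```python
from collections import Counter


def histogram_counts(values, class_width):
    """Count how many values fall in each class [low, low + class_width)."""
    # Integer-divide each value to find the lower edge of its class.
    counts = Counter((v // class_width) * class_width for v in values)
    return {(low, low + class_width): counts[low] for low in sorted(counts)}


# Hypothetical airport counts for eight states (made-up numbers).
airports = [12, 45, 88, 103, 97, 230, 15, 60]

classes = histogram_counts(airports, class_width=50)
for (low, high), n_states in classes.items():
    # x-axis: the class (range of airport counts); y-axis: number of states in it.
    print(f"{low:>3} to {high - 1:>3} airports: {n_states} state(s)")
```

Each printed row corresponds to one bar: the range is the bar's position on the x-axis, and the count is its height on the y-axis. This sketch leaves out classes that contain no values, whereas a plotted histogram would show them as empty bars.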

We first made a histogram of the census (the entire population), then took four random samples of 20 states and made a histogram of each sample. Figure 3-1 shows the results.

As shown in Figure 3-1, when you compare the sample distributions to the distribution of the population using the histograms, you can see there are differences. Sample 2 looks much more like the population than Sample 4 does. However, they are all valid samples in that they were randomly selected from the population. Each sample is an approximation of the true population distribution. In addition, the mean and standard deviation of each sample are likely close to the mean and standard deviation of the population, but not equal to them. (For a refresher on mean and standard deviation, see Chapter 9.) These characteristics of sampling error, where valid samples from the population are almost always somewhat different from the population, are true of any random sample.
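The following sketch mimics the exercise described above: it draws four random samples of 20 from a population of 51 values and compares each sample's mean and standard deviation with the population's. The population here is randomly generated and purely hypothetical (it is not the real 2011 airport data), and the seed value and the 10-to-500 range are arbitrary assumptions.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed (arbitrary) so the run is repeatable

# Hypothetical population: 51 made-up airport counts, NOT the real 2011 figures.
population = [rng.randint(10, 500) for _ in range(51)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)  # population SD: every member is known

print(f"Population: mean={pop_mean:.1f}, SD={pop_sd:.1f}")

# Draw four random samples of 20 values each, without replacement.
samples = [rng.sample(population, 20) for _ in range(4)]
for i, sample in enumerate(samples, start=1):
    m = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    print(f"Sample {i}: mean={m:.1f}, SD={s:.1f}, "
          f"difference from population mean={m - pop_mean:+.1f}")
```

Changing the seed produces different samples, each typically near the population values without matching them exactly, which is the sampling error the text describes.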

Digging into probability distributions

As described in the preceding section, samples differ from populations because of

random fluctuations. Because these random fluctuations fall into patterns,

FIGURE 3-1: Distribution of the number of private and public airports in 2011 in the population (of 50 states and the District of Columbia), and four different samples of 20 states from the same population.

© John Wiley & Sons, Inc.